[python] voice communication for python help!

Posted by Eric on Stack Overflow See other posts from Stack Overflow or by Eric
Published on 2010-06-10T10:36:52Z Indexed on 2010/06/10 10:42 UTC
Read the original article Hit count: 541

Filed under:
|
|
|
|

Hello! I'm currently trying to write a voicechat program in python. All tips/trick is welcome to do this.

So far I found pyAudio to be a wrapper of PortAudio. So I played around with that and got an input stream from my microphone to be played back to my speakers. Only RAW of course.

But I can't send RAW-data over the netowrk (due the size duh), so I'm looking for a way to encode it. And I searched around the 'net and stumbled over this speex-wrapper for python. It seems to good to be true, and believe me, it was.

You see in pyAudio you can set the size of the chunks you want to take from your input audiobuffer, and in that sample code on the link, it's set to 320. Then when it's encoded, its like ~40 bytes of data per chunk, which is fairly acceptable I guess. And now for the problem.

I start a sample program which just takes the input stream, encodes the chunks, decodes them and play them (not sending over the network due testing). If I just let my computer idle and run this program it works great, but as soon as I do something, i.e start Firefox or something, the audio input buffer gets all clogged up! It just grows and then it all crashes and gives me an overflow error on the buffer..

OK, so why am I just taking 320 bytes of the stream? I could just take like 1024 bytes or something and that will easy the pressure on the buffer. BUT. If I give speex 1024 bytes of data to encode/decode, it either crashes and says that thats too big for its buffer. OR it encodes/decodes it, but the sound is very noisy and "choppy" as if it only encoded a tiny bit of that 1024 chunk and the rest is static noise. So the sound sounds like a helicopter, lol.

I did some research and it seems that speex only can convert 320 bytes of data at time, and well, 640 for wide-band. But that's the standard? How can I fix this problem? How should I construct my program to work with speex? I could use a middle-buffer tho that takes all available data to read from the buffer, then chunk this up in 320 bits and encode/decode them. But this takes a bit longer time and seems like a very bad solution of the problem..

Because as far as I know, there's no other encoder for python that encodes the audio so it can be sent over the network in acceptable small packages, or? I've been googling for three days now.

Also there is this pyMedia library, I don't know if its good to convert to mp3/ogg for this kind of software.

Thank in in advance for reading this, hope anyone can help me! (:

© Stack Overflow or respective owner

Related posts about python

Related posts about network